Online Influence Maximization under Independent Cascade Model with Semi-Bandit Feedback
Authors
Abstract
We study a stochastic online problem of learning to influence in a social network with semi-bandit feedback, that is, individual observations of how influenced users influence others. Our problem combines the challenges of partial monitoring, because the learning agent only observes the influenced portion of the network, and of combinatorial bandits, because the cardinality of the feasible set is exponential in the maximum number of influencers. We propose a computationally efficient UCB-like algorithm for solving our problem, IMLinUCB, and analyze it on forests. Our regret bounds are polynomial in all quantities of interest, reflect the structure of the network, and do not depend on inherently large quantities, such as the reciprocal of the minimum probability of being influenced and the cardinality of the action set. To the best of our knowledge, these are the first such results. IMLinUCB permits linear generalization and is therefore suitable for large-scale problems. We evaluate IMLinUCB on several synthetic problems and observe that its regret scales as suggested by our upper bounds. A special form of our problem can be viewed as a linear bandit, and we match the regret bounds of LinUCB in this case.
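The feedback model described in the abstract can be illustrated with a minimal Python sketch of one diffusion round under the independent cascade model. It is not IMLinUCB and is not taken from the paper: the function name `simulate_cascade`, the toy graph, and the assumption that an edge's outcome is observed exactly when its source node is influenced are illustrative choices made here.

```python
import random
from collections import deque


def simulate_cascade(edges, probs, seeds, rng):
    """Run one independent cascade round (hypothetical helper).

    edges: dict mapping a node to a list of its out-neighbours
    probs: dict mapping an edge (u, v) to its activation probability
    seeds: iterable of initially influenced nodes
    rng:   a random.Random instance

    Returns (influenced, observed), where `observed` maps each edge whose
    source node was influenced to True/False (assumed semi-bandit feedback).
    """
    influenced = set(seeds)
    observed = {}
    # dict.fromkeys removes duplicate seeds while keeping their order
    queue = deque(dict.fromkeys(seeds))
    while queue:
        u = queue.popleft()
        for v in edges.get(u, []):
            # Each influenced node gets exactly one attempt per outgoing edge
            success = rng.random() < probs[(u, v)]
            observed[(u, v)] = success
            if success and v not in influenced:
                influenced.add(v)
                queue.append(v)
    return influenced, observed


# Toy forest with deterministic edges so the outcome is easy to follow
edges = {0: [1, 2], 1: [3], 2: [4]}
probs = {(0, 1): 1.0, (0, 2): 0.0, (1, 3): 1.0, (2, 4): 1.0}
influenced, observed = simulate_cascade(edges, probs, [0], random.Random(0))
print(sorted(influenced))
print(observed)
```

In this toy run node 2 is never influenced, so the edge (2, 4) yields no observation at all. This is the partial-monitoring aspect: the learner only receives feedback from the influenced portion of the network.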
Similar resources
Model-Independent Online Learning for Influence Maximization
We consider influence maximization (IM) in social networks, which is the problem of maximizing the number of users that become aware of a product by selecting a set of “seed” users to expose the product to. While prior work assumes a known model of information diffusion, we propose a novel parametrization that not only makes our framework agnostic to the underlying diffusion model, but also sta...
Diffusion Independent Semi-Bandit Influence Maximization
We consider influence maximization (IM) in social networks, which is the problem of maximizing the number of users that become aware of a product by selecting a set of “seed” users to expose the product to. While prior work assumes a known model of information diffusion, we propose a parametrization in terms of pairwise reachability which makes our framework agnostic to the underlying diffusion...
Tighter Regret Bounds for Influence Maximization and Other Combinatorial Semi-Bandits with Probabilistically Triggered Arms
We study combinatorial multi-armed bandit with probabilistically triggered arms and semi-bandit feedback (CMAB-T). We resolve a serious issue in the prior CMAB-T studies where the regret bounds contain a possibly exponentially large factor of 1/p, where p is the minimum positive probability that an arm is triggered by any action. We address this issue by introducing a triggering probability mod...
Improving Regret Bounds for Combinatorial Semi-Bandits with Probabilistically Triggered Arms and Its Applications
We study combinatorial multi-armed bandit with probabilistically triggered arms and semi-bandit feedback (CMAB-T). We resolve a serious issue in the prior CMAB-T studies where the regret bounds contain a possibly exponentially large factor of 1/p∗, where p∗ is the minimum positive probability that an arm is triggered by any action. We address this issue by introducing a triggering probability m...
Combinatorial Multi-armed Bandit with Probabilistically Triggered Arms: A Case with Bounded Regret
In this paper, we study the combinatorial multi-armed bandit problem (CMAB) with probabilistically triggered arms (PTAs). Under the assumption that the arm triggering probabilities (ATPs) are positive for all arms, we prove that a class of upper confidence bound (UCB) policies, named Combinatorial UCB with exploration rate κ (CUCB-κ), and Combinatorial Thompson Sampling (CTS), which estimates t...
Journal title:
Volume, issue:
Pages: -
Publication date: 2017